AITopics | cluster 0

Explainable AI for Curie Temperature Prediction in Magnetic Materials

Ajaib, M. Adeel, Nasir, Fariha, Rehman, Abdul

arXiv.org Artificial Intelligence, Nov-20-2025

Traditional approaches based on quantum mechanical computations or empirical models are often limited in scalability and accuracy. In recent years, machine learning (ML) has emerged as a promising alternative for property prediction across materials science domains [1-9]. Building on this momentum, several recent studies have proposed the use of ML models trained on curated magnetic datasets. In particular, the recent study [10] introduced the NE-MAD database, which aggregates experimentally measured magnetic transition temperatures and compositions. Similarly, the study in [11] utilized two of the largest available datasets of experimental Curie temperatures, comprising over 2,500 materials for training and more than 3,000 entries for validation, to compare machine learning strategies for predicting Curie temperature solely from chemical composition. Our work is inspired by these prior efforts and aims to improve predictive accuracy and gain insights into model interpretability. We develop a pipeline that starts from the NE-MAD dataset, augments it with compositional and elemental features, and evaluates several ML models. A key contribution of our work is the integration of explainable AI (XAI) through SHAP (SHapley Additive exPlanations) analysis, which allows us to quantify how each input feature contributes to the model's prediction. Moreover, we benchmark our models on external datasets from the literature to demonstrate generalization.
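To make the idea of per-feature attribution concrete, here is a minimal sketch of exact Shapley values on a toy model. It is not the authors' pipeline and does not use the `shap` library: the linear `toy_model`, its `WEIGHTS`, and the input and baseline vectors are all hypothetical, and "missing" features are simply replaced by baseline values, which is one common convention among several.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features are switched from their baseline value to their actual value
    one at a time, in every possible order; each feature's marginal effect
    is averaged over all orderings. Cost grows as n!, so this is only
    practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / factorial(n) for total in phi]


# Hypothetical linear model standing in for a trained regressor.
WEIGHTS = [2.0, -1.0, 0.5]


def toy_model(features):
    return 10.0 + sum(w * f for w, f in zip(WEIGHTS, features))


x = [3.0, 1.0, 4.0]          # hypothetical sample
baseline = [1.0, 1.0, 2.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
```

For a linear model the attribution of feature i reduces to `WEIGHTS[i] * (x[i] - baseline[i])`, and the attributions always sum to `toy_model(x) - toy_model(baseline)`, which gives a quick sanity check.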

artificial intelligence, curie temperature, machine learning, (18 more...)

arXiv: 2508.06996

Country: North America > United States (0.29)

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)


AttentiveGRUAE: An Attention-Based GRU Autoencoder for Temporal Clustering and Behavioral Characterization of Depression from Wearable Data

Soley, Nidhi, Patel, Vishal M, Taylor, Casey O

arXiv.org Artificial Intelligence, Nov-17-2025

In this study, we present AttentiveGRUAE, a novel attention-based gated recurrent unit (GRU) autoencoder designed for temporal clustering and prediction of outcome from longitudinal wearable data. Our model jointly optimizes three objectives: (1) learning a compact latent representation of daily behavioral features via sequence reconstruction, (2) predicting end-of-period depression rate through a binary classification head, and (3) identifying behavioral subtypes through Gaussian Mixture Model (GMM) based soft clustering of learned embeddings. We evaluate AttentiveGRUAE on longitudinal sleep data from 372 participants (GLOBEM 2018-2019), and it demonstrates superior performance over baseline clustering, domain-aligned self-supervised, and ablated models in both clustering quality (silhouette score = 0.70 vs 0.32-0.70) and depression classification (AUC = 0.74 vs 0.50-0.67). Additionally, external validation on cross-year cohorts from 332 participants (GLOBEM 2020-2021) confirms cluster reproducibility (silhouette score = 0.63, AUC = 0.61) and stability. We further perform subtype analysis and visualize temporal attention, which highlights sleep-related differences between clusters and identifies salient time windows that align with changes in sleep regularity, yielding clinically interpretable explanations of risk.
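The soft clustering step can be illustrated with the posterior membership probabilities (responsibilities) of a Gaussian mixture. This is a sketch under simplifying assumptions rather than the model described above: embeddings are one-dimensional, and the mixture weights, means, and variances are hypothetical values that would normally be fitted.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior probability that x belongs to each mixture component.

    Each entry is weight_k * N(x | mean_k, var_k), normalised so the
    entries sum to one. No underflow protection is included, so points
    very far from every component would need a log-space version.
    """
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-component mixture over a 1-D embedding.
weights = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

near_first = responsibilities(0.0, weights, means, variances)
midpoint = responsibilities(2.0, weights, means, variances)
```

A point at the first component's mean is assigned almost entirely to that component, while a point halfway between two equally weighted, equal-variance components is split evenly.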

artificial intelligence, depression, machine learning, (17 more...)

arXiv: 2510.02558

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)


User Profiles of Sleep Disorder Sufferers: Towards Explainable Clustering and Differential Variable Analysis

Sellami, Sifeddine, Agoun, Juba, Yessad, Lamia, Bounia, Louenas

arXiv.org Artificial Intelligence, Oct-21-2025

Sleep disorders have a major impact on patients' health and quality of life, but their diagnosis remains complex due to the diversity of symptoms. Today, technological advances, combined with medical data analysis, are opening new perspectives for a better understanding of these disorders. In particular, explainable artificial intelligence (XAI) aims to make AI model decisions understandable and interpretable for users. In this study, we propose a clustering-based method to group patients according to different sleep disorder profiles. By integrating an explainable approach, we identify the key factors influencing these pathologies. An experiment on anonymized real data illustrates the effectiveness and relevance of our approach.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2510.15986

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.91)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.96)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.55)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.34)


Clustering Discourses: Racial Biases in Short Stories about Women Generated by Large Language Models

Bonil, Gustavo, Gondim, João, Santos, Marina dos, Hashiguti, Simone, Maia, Helena, Silva, Nadia, Pedrini, Helio, Avila, Sandra

arXiv.org Artificial Intelligence, Sep-4-2025

This study investigates how large language models, in particular LLaMA 3.2-3B, construct narratives about Black and white women in short stories generated in Portuguese. From 2100 texts, we applied computational methods to group semantically similar stories, allowing a selection for qualitative analysis. Three main discursive representations emerge: social overcoming, ancestral mythification and subjective self-realization. The analysis uncovers how grammatically coherent, seemingly neutral texts materialize a crystallized, colonially structured framing of the female body, reinforcing historical inequalities. The study proposes an integrated approach that combines machine learning techniques with qualitative, manual discourse analysis.

large language model, machine learning, short story, (20 more...)

arXiv: 2509.02834

Country: South America > Brazil (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Law > Civil Rights & Constitutional Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Contextual Phenotyping of Pediatric Sepsis Cohort Using Large Language Models

Nagori, Aditya, Gautam, Ayush, Wiens, Matthew O., Nguyen, Vuong, Mugisha, Nathan Kenya, Kabakyenga, Jerome, Kissoon, Niranjan, Ansermino, John Mark, Kamaleswaran, Rishikesan

arXiv.org Artificial Intelligence, Aug-5-2025

Clustering patient subgroups is essential for personalized care and efficient resource use. Traditional clustering methods struggle with high-dimensional, heterogeneous healthcare data and lack contextual understanding. This study evaluates Large Language Model (LLM) based clustering against classical methods using a pediatric sepsis dataset from a low-income country (LIC), containing 2,686 records with 28 numerical and 119 categorical variables. Patient records were serialized into text with and without a clustering objective. Embeddings were generated using quantized LLAMA 3.1 8B, DeepSeek-R1-Distill-Llama-8B with low-rank adaptation (LoRA), and Stella-En-400M-V5 models. K-means clustering was applied to these embeddings. Classical comparisons included K-Medoids clustering on UMAP and FAMD-reduced mixed data. Silhouette scores and statistical tests evaluated cluster quality and distinctiveness. Stella-En-400M-V5 achieved the highest Silhouette Score (0.86). LLAMA 3.1 8B with the clustering objective performed better with a higher number of clusters, identifying subgroups with distinct nutritional, clinical, and socioeconomic profiles. LLM-based methods outperformed classical techniques by capturing richer context and prioritizing key features. These results highlight the potential of LLMs for contextual phenotyping and informed decision-making in resource-limited settings.
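Since cluster quality here is reported as a silhouette score, a small reference implementation may help in reading those numbers. The sketch below computes the mean silhouette coefficient with Euclidean distance on toy 2-D points; the points and labels are illustrative only and have nothing to do with the sepsis dataset or the embedding models named above.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    coefficient is (b - a) / max(a, b). Points in singleton clusters
    contribute 0, following the usual convention.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        if denominator > 0:
            total += (b - a) / denominator
    return total / len(points)


# Toy data: two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across it
```

Scores close to 1 indicate compact, well-separated clusters, while negative scores indicate that points sit closer to another cluster than to their own, as in the deliberately mislabelled case.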

large language model, machine learning, natural language, (20 more...)

arXiv: 2505.09805

Country:

North America > United States (0.46)
Africa > Uganda (0.29)
North America > Canada > British Columbia (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Comparing Self-Disclosure Themes and Semantics to a Human, a Robot, and a Disembodied Agent

Chiang, Sophie, Laban, Guy, Cross, Emily S., Gunes, Hatice

arXiv.org Artificial Intelligence, Apr-10-2025

As social robots and other artificial agents become more conversationally capable, it is important to understand whether the content and meaning of self-disclosure towards these agents changes depending on the agent's embodiment. In this study, we analysed conversational data from three controlled experiments in which participants self-disclosed to a human, a humanoid social robot, and a disembodied conversational agent. Using sentence embeddings and clustering, we identified themes in participants' disclosures, which were then labelled and explained by a large language model. We subsequently assessed whether these themes and the underlying semantic structure of the disclosures varied by agent embodiment. Our findings reveal strong consistency: thematic distributions did not significantly differ across embodiments, and semantic similarity analyses showed that disclosures were expressed in highly comparable ways. These results suggest that while embodiment may influence human behaviour in human-robot and human-agent interactions, people tend to maintain a consistent thematic focus and semantic structure in their disclosures, whether speaking to humans or artificial interlocutors.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2504.06374

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)


Scalable Robust Bayesian Co-Clustering with Compositional ELBOs

Vinod, Ashwin, Bajaj, Chandrajit

arXiv.org Machine Learning, Apr-8-2025

Co-clustering exploits the duality of instances and features to simultaneously uncover meaningful groups in both dimensions, often outperforming traditional clustering in high-dimensional or sparse data settings. Although recent deep learning approaches successfully integrate feature learning and cluster assignment, they remain susceptible to noise and can suffer from posterior collapse within standard autoencoders. In this paper, we present the first fully variational Co-clustering framework that directly learns row and column clusters in the latent space, leveraging a doubly reparameterized ELBO to improve gradient signal-to-noise separation. Our unsupervised model integrates a Variational Deep Embedding with a Gaussian Mixture Model (GMM) prior for both instances and features, providing a built-in clustering mechanism that naturally aligns latent modes with row and column clusters. Furthermore, our regularized end-to-end noise learning Compositional ELBO architecture jointly reconstructs the data while regularizing against noise through the KL divergence, thus gracefully handling corrupted or missing inputs in a single training pipeline. To counteract posterior collapse, we introduce a scale modification that increases the encoder's latent means only in the reconstruction pathway, preserving richer latent representations without inflating the KL term. Finally, a mutual information-based cross-loss ensures coherent co-clustering of rows and columns. Empirical results on diverse real-world datasets from multiple modalities, numerical, textual, and image-based, demonstrate that our method not only preserves the advantages of prior Co-clustering approaches but also exceeds them in accuracy and robustness, particularly in high-dimensional or noisy settings.

artificial intelligence, machine learning, row and column, (18 more...)

arXiv: 2504.04079

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Wisconsin (0.04)
Europe > Germany > Bavaria > Regensburg (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.94)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)


Enhancing Machine Learning Performance through Intelligent Data Quality Assessment: An Unsupervised Data-centric Framework

Rahal, Manal, Ahmed, Bestoun S., Szabados, Gergely, Fornstedt, Torgny, Samuelsson, Jorgen

arXiv.org Machine Learning, Feb-18-2025

Poor data quality limits the advantageous power of Machine Learning (ML) and weakens high-performing ML software systems. Nowadays, data are more prone to the risk of poor quality due to their increasing volume and complexity. Therefore, tedious and time-consuming work goes into data preparation and improvement before moving further in the ML pipeline. To address this challenge, we propose an intelligent data-centric evaluation framework that can identify high-quality data and improve the performance of an ML system. The proposed framework combines the curation of quality measurements and unsupervised learning to distinguish high- and low-quality data. The framework is designed to integrate flexible and general-purpose methods so that it can be deployed in various domains and applications. To validate the outcomes of the designed framework, we implemented it in a real-world use case from the field of analytical chemistry, where it is tested on three datasets of anti-sense oligonucleotides. A domain expert is consulted to identify the relevant quality measurements and evaluate the outcomes of the framework. The results show that the quality-centric data evaluation framework identifies the characteristics of high-quality data that guide the conduct of efficient laboratory experiments and consequently improve the performance of the ML system.

characteristic, dataset, quality measurement, (14 more...)

arXiv: 2502.13198

Country:

Europe > Sweden > Värmland County > Karlstad (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


Fuel Efficiency Analysis of the Public Transportation System Based on the Gaussian Mixture Model Clustering

Ma, Zhipeng, Jørgensen, Bo Nørregaard, Ma, Zheng

arXiv.org Artificial Intelligence, Jan-21-2025

Public transportation is a major source of greenhouse gas emissions, highlighting the need to improve bus fuel efficiency. Clustering algorithms assist in analyzing fuel efficiency by grouping data into clusters, but irrelevant features may complicate the analysis, and choosing the optimal number of clusters remains a challenging task. Therefore, this paper employs Gaussian mixture models to cluster the solo fuel-efficiency dataset. Moreover, an integration method that combines the Silhouette index, Calinski-Harabasz index, and Davies-Bouldin index is developed to select the optimal number of clusters. A dataset with 4006 bus trips in North Jutland, Denmark, is utilized as the case study. Trips are first split into three groups, then one group is divided further, resulting in four categories: extreme, normal, low, and extremely low fuel efficiency. A preliminary study using visualization analysis is conducted to investigate how driving behaviors and route conditions affect fuel efficiency. The results indicate that both individual driving habits and route characteristics have a significant influence on fuel efficiency.
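The abstract does not say how the three indices are combined, so the following is only one plausible sketch: rank each candidate number of clusters under every index and pick the one with the best total rank. The rank-sum rule, the function names, and the index values are all assumptions for illustration, not the paper's integration method or its results. The only facts relied on are that higher Silhouette and Calinski-Harabasz values are better and lower Davies-Bouldin values are better.

```python
def rank(scores, higher_is_better):
    """Map each candidate k to its rank (1 = best) under one index."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {k: position for position, k in enumerate(ordered, start=1)}


def select_k(silhouette, calinski_harabasz, davies_bouldin):
    """Pick the k with the lowest summed rank across the three indices.

    Ties on the summed rank are broken in favour of the smaller k.
    """
    ranks = [
        rank(silhouette, higher_is_better=True),
        rank(calinski_harabasz, higher_is_better=True),
        rank(davies_bouldin, higher_is_better=False),
    ]
    totals = {k: sum(r[k] for r in ranks) for k in silhouette}
    return min(totals, key=lambda k: (totals[k], k))


# Made-up index values for candidate cluster counts 2, 3 and 4.
silhouette = {2: 0.41, 3: 0.55, 4: 0.48}
calinski_harabasz = {2: 310.0, 3: 420.0, 4: 455.0}
davies_bouldin = {2: 0.95, 3: 0.70, 4: 0.82}

best_k = select_k(silhouette, calinski_harabasz, davies_bouldin)
```

Working with ranks avoids having to put the three indices on a common scale, at the cost of ignoring how large the gaps between candidates are.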

artificial intelligence, fuel efficiency, machine learning, (18 more...)

doi: 10.1007/978-3-031-73500-4_18

arXiv: 2501.12429

Country:

Europe > Denmark > North Jutland (0.25)
Europe > Poland (0.04)
Europe > Germany (0.04)
Europe > Denmark > Southern Denmark (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.73)


Cluster Catch Digraphs with the Nearest Neighbor Distance

Shi, Rui, Billor, Nedret, Ceyhan, Elvan

arXiv.org Machine Learning, Jan-9-2025

We introduce a new method for clustering based on Cluster Catch Digraphs (CCDs). The new method addresses the limitations of RK-CCDs by employing a new variant of the spatial randomness test that uses the nearest neighbor distance (NND) instead of the Ripley's K function used by RK-CCDs. We conduct a comprehensive Monte Carlo analysis to assess the performance of our method, considering factors such as dimensionality, data set size, number of clusters, cluster volumes, and inter-cluster distance. Our method is particularly effective for high-dimensional data sets, comparable to or outperforming KS-CCDs and RK-CCDs, which rely on a KS-type statistic or Ripley's K function. We also evaluate our methods using real and complex data sets, comparing them to well-known clustering methods. Again, our methods exhibit competitive performance, producing high-quality clusters with desirable properties.

Keywords: Graph-based clustering, Cluster catch digraphs, High-dimensional data, The nearest neighbor distance, Spatial randomness test
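As background on how nearest neighbor distances can signal departure from spatial randomness, here is a sketch of the classical Clark-Evans ratio in two dimensions. It is a different, much simpler statistic than the test variant introduced in the paper, it applies no edge correction, and the point sets and the `area` argument are hypothetical.

```python
import math


def nearest_neighbor_distances(points):
    """Euclidean distance from each point to its closest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def clark_evans_ratio(points, area):
    """Observed mean NND divided by its expectation under 2-D randomness.

    For a homogeneous Poisson process with intensity lam = n / area, the
    expected nearest neighbor distance is 1 / (2 * sqrt(lam)). Ratios below
    1 suggest clustering; ratios above 1 suggest regular spacing.
    """
    distances = nearest_neighbor_distances(points)
    observed = sum(distances) / len(distances)
    intensity = len(points) / area
    expected = 0.5 / math.sqrt(intensity)
    return observed / expected


# Hypothetical example: four points packed into a corner of a 10 x 10 window.
clustered = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
ratio = clark_evans_ratio(clustered, area=100.0)

nnd = nearest_neighbor_distances([(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)])
```

The brute-force search is quadratic in the number of points; a spatial index such as a k-d tree would be the usual replacement for larger data sets.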

artificial intelligence, cluster 0, machine learning, (16 more...)

arXiv: 2501.06268

Country: North America > United States > New York (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
